Overview

Dataset Statistics

Number of Variables 10
Number of Rows 101503
Missing Cells 22729
Missing Cells (%) 2.2%
Duplicate Rows 328
Duplicate Rows (%) 0.3%
Total Size in Memory 8.5 MB
Average Row Size in Memory 88.0 B

Variable Types
  • Numerical: 10
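
The percentage and size figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming "MB" here means 2**20 bytes (the report does not state which convention it uses):

```python
# Figures copied from the Dataset Statistics table above.
n_rows = 101503
n_vars = 10
missing_cells = 22729
duplicate_rows = 328
avg_row_bytes = 88.0

# Missing cells are counted against every cell in the table (rows x variables).
total_cells = n_rows * n_vars
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows are counted against the number of rows.
duplicate_pct = 100 * duplicate_rows / n_rows

# Assumption: the report's "MB" is 2**20 bytes.
total_mb = n_rows * avg_row_bytes / 2**20

print(f"{missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicated, {total_mb:.1f} MB")
```

The 22729 missing cells are also the sum of the two per-variable missing counts reported further down (20103 for monthly_income and 2626 for number_of_dependents).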

Dataset Insights

Similar Distribution
  • number_of_times_90_days_late and number_of_time_60_89_days_past_due_not_worse have similar distributions

Missing
  • monthly_income has 20103 (19.81%) missing values
  • number_of_dependents has 2626 (2.59%) missing values

Skewed
  • revolving_utilization_of_unsecured_lines is skewed
  • number_of_time_30_59_days_past_due_not_worse is skewed
  • debt_ratio is skewed
  • monthly_income is skewed
  • number_of_open_credit_lines_and_loans is skewed
  • number_of_times_90_days_late is skewed
  • number_real_estate_loans_or_lines is skewed
  • number_of_time_60_89_days_past_due_not_worse is skewed
  • number_of_dependents is skewed

Zeros
  • revolving_utilization_of_unsecured_lines has 7311 (7.2%) zeros
  • number_of_time_30_59_days_past_due_not_worse has 85190 (83.93%) zeros
  • number_of_times_90_days_late has 95785 (94.37%) zeros
  • number_real_estate_loans_or_lines has 38066 (37.5%) zeros
  • number_of_time_60_89_days_past_due_not_worse has 96375 (94.95%) zeros
  • number_of_dependents has 58618 (57.75%) zeros

Variables


revolving_utilization_of_unsecured_lines

numerical

Approximate Distinct Count 85716
Approximate Unique (%) 84.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1624048 B
Mean 5.31
Minimum 0
Maximum 21821
Zeros 7311
Zeros (%) 7.2%
Negatives 0
Negatives (%) 0.0%
  • revolving_utilization_of_unsecured_lines is skewed right (γ1 = 58.3241)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0.03013
Median 0.1526
Q3 0.5642
95-th Percentile 1
Maximum 21821
Range 21821
IQR 0.5341

Descriptive Statistics

Mean 5.31
Standard Deviation 196.156
Variance 38477.1915
Sum 538980.9601
Skewness 58.3241
Kurtosis 4210.9048
Coefficient of Variation 36.9409
  • revolving_utilization_of_unsecured_lines is not normally distributed (p-value 4.2265336584317635e-25)
  • revolving_utilization_of_unsecured_lines has 493 outliers
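
Several rows in the tables above are derived from others. A short sketch that recomputes them from the rounded values printed for revolving_utilization_of_unsecured_lines; small differences from the report are expected because the inputs are already rounded:

```python
# Values copied from the tables above (already rounded by the report).
minimum, maximum = 0.0, 21821.0
q1, q3 = 0.03013, 0.5642
mean, std = 5.31, 196.156

value_range = maximum - minimum   # reported as 21821
iqr = q3 - q1                     # reported as 0.5341
variance = std ** 2               # reported as 38477.1915
cv = std / mean                   # coefficient of variation, reported as 36.9409

print(value_range, round(iqr, 4), round(variance, 1), round(cv, 2))
```

A coefficient of variation near 37 together with a median of 0.1526 indicates that the mean of 5.31 is driven by a small number of very large values, which matches the reported skewness.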

age

numerical

Approximate Distinct Count 82
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1624048 B
Mean 52.4054
Minimum 21
Maximum 104
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • age is skewed right (γ1 = 0.1871)

Quantile Statistics

Minimum 21
5-th Percentile 29
Q1 41
Median 52
Q3 63
95-th Percentile 78
Maximum 104
Range 83
IQR 22

Descriptive Statistics

Mean 52.4054
Standard Deviation 14.7798
Variance 218.4412
Sum 5.3193e+06
Skewness 0.1871
Kurtosis -0.5035
Coefficient of Variation 0.282
  • age has 21 outliers
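
The kurtosis of -0.5035 reported for age can only be an excess kurtosis (raw kurtosis minus 3), because raw kurtosis is never below 1. The sketch below shows moment-based skewness (γ1) and excess kurtosis in plain Python. It assumes population (biased) estimators, which the report does not confirm, and the helper name is hypothetical:

```python
from statistics import fmean


def skew_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from central moments.

    Assumption: population (biased) moments, divided by n rather than n - 1.
    """
    n = len(values)
    mu = fmean(values)
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    m4 = sum((x - mu) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# A symmetric toy sample has zero skewness; a long right tail makes it positive.
symmetric = skew_and_excess_kurtosis([1, 2, 3, 4, 5])
right_tailed = skew_and_excess_kurtosis([1, 2, 3, 4, 50])
print(symmetric, right_tailed)
```

Read this way, a skewness of 0.1871 and an excess kurtosis of -0.5035 describe an age distribution that is close to symmetric and somewhat flatter than a normal curve.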

number_of_time_30_59_days_past_due_not_worse

numerical

Approximate Distinct Count 16
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1624048 B
Mean 0.4538
Minimum 0
Maximum 98
Zeros 85190
Zeros (%) 83.9%
Negatives 0
Negatives (%) 0.0%
  • number_of_time_30_59_days_past_due_not_worse is skewed right (γ1 = 20.9409)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 0
95-th Percentile 2
Maximum 98
Range 98
IQR 0

Descriptive Statistics

Mean 0.4538
Standard Deviation 4.5385
Variance 20.5979
Sum 46059
Skewness 20.9409
Kurtosis 446.776
Coefficient of Variation 10.0017
  • number_of_time_30_59_days_past_due_not_worse is not normally distributed (p-value 4.738016372567772e-25)
  • number_of_time_30_59_days_past_due_not_worse has 16313 outliers
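
The outlier count for this variable equals the number of nonzero rows. That is what a 1.5 × IQR fence rule would produce when Q1 and Q3 are both 0. The rule is an assumption here: the report does not state how it flags outliers, and the check below only shows that the numbers are consistent with it.

```python
# Values copied from the tables above.
n_rows = 101503
zeros = 85190
q1 = 0.0
q3 = 0.0

# Assumed rule: a value is an outlier if it lies outside
# [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The variable has no missing or negative values, so every nonzero row
# lies above an upper fence of 0.
nonzero = n_rows - zeros
print(lower_fence, upper_fence, nonzero)
```

For a zero-inflated count such as this one, the outlier figure mostly restates the share of nonzero rows (about 16%) rather than identifying unusual records.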

debt_ratio

numerical

Approximate Distinct Count 79878
Approximate Unique (%) 78.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1624048 B
Mean 344.475
Minimum 0
Maximum 268326
Zeros 2775
Zeros (%) 2.7%
Negatives 0
Negatives (%) 0.0%
  • debt_ratio is skewed right (γ1 = 73.1898)

Quantile Statistics

Minimum 0
5-th Percentile 0.004383
Q1 0.1734
Median 0.3643
Q3 0.8516
95-th Percentile 2435
Maximum 268326
Range 268326
IQR 0.6782

Descriptive Statistics

Mean 344.475
Standard Deviation 1632.5952
Variance 2.6654e+06
Sum 3.4965e+07
Skewness 73.1898
Kurtosis 10099.0289
Coefficient of Variation 4.7394
  • debt_ratio is not normally distributed (p-value 4.239089901251397e-25)
  • debt_ratio has 21018 outliers

monthly_income

numerical

Approximate Distinct Count 11976
Approximate Unique (%) 14.7%
Missing 20103
Missing (%) 19.8%
Infinite 0
Infinite (%) 0.0%
Memory Size 1302400 B
Mean 6855.0356
Minimum 0
Maximum 7.727e+06
Zeros 1020
Zeros (%) 1.0%
Negatives 0
Negatives (%) 0.0%
  • monthly_income is skewed right (γ1 = 159.2129)

Quantile Statistics

Minimum 0
5-th Percentile 1340.95
Q1 3408
Median 5400
Q3 8200
95-th Percentile 14568.1
Maximum 7.727e+06
Range 7.727e+06
IQR 4792

Descriptive Statistics

Mean 6855.0356
Standard Deviation 36508.6004
Variance 1.3329e+09
Sum 5.58e+08
Skewness 159.2129
Kurtosis 29134.6735
Coefficient of Variation 5.3258
  • monthly_income is not normally distributed (p-value 4.226523371231217e-25)
  • monthly_income has 3368 outliers
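
Because monthly_income has missing values, some of its figures use the non-missing row count as the denominator instead of the full 101503 rows. A quick check of which rows do, using only numbers printed in this report:

```python
# Values copied from the report.
n_rows = 101503
missing = 20103
distinct = 11976
memory_bytes = 1302400
total_sum = 5.58e8  # rounded in the report

non_missing = n_rows - missing

missing_pct = 100 * missing / n_rows        # share of all rows
unique_pct = 100 * distinct / non_missing   # share of non-missing rows
bytes_per_value = memory_bytes / non_missing
approx_mean = total_sum / non_missing       # close to the reported 6855.0356

print(non_missing, round(missing_pct, 1), round(unique_pct, 1), bytes_per_value)
```

Dividing the distinct count by all 101503 rows would give about 11.8%, not the reported 14.7%, so the unique percentage, the memory size, and the mean all appear to be computed on the 81400 non-missing rows.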

number_of_open_credit_lines_and_loans

numerical

Approximate Distinct Count 56
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1624048 B
Mean 8.4535
Minimum 0
Maximum 85
Zeros 1250
Zeros (%) 1.2%
Negatives 0
Negatives (%) 0.0%
  • number_of_open_credit_lines_and_loans is skewed right (γ1 = 1.2252)

Quantile Statistics

Minimum 0
5-th Percentile 2
Q1 5
Median 7
Q3 11
95-th Percentile 18
Maximum 85
Range 85
IQR 6

Descriptive Statistics

Mean 8.4535
Standard Deviation 5.1441
Variance 26.4618
Sum 858057
Skewness 1.2252
Kurtosis 3.3015
Coefficient of Variation 0.6085
  • number_of_open_credit_lines_and_loans is not normally distributed (p-value 4.30234129784462e-09)
  • number_of_open_credit_lines_and_loans has 2699 outliers

number_of_times_90_days_late

numerical

Approximate Distinct Count 18
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1624048 B
Mean 0.2967
Minimum 0
Maximum 98
Zeros 95785
Zeros (%) 94.4%
Negatives 0
Negatives (%) 0.0%
  • number_of_times_90_days_late is skewed right (γ1 = 21.3556)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 0
95-th Percentile 1
Maximum 98
Range 98
IQR 0

Descriptive Statistics

Mean 0.2967
Standard Deviation 4.5159
Variance 20.393
Sum 30115
Skewness 21.3556
Kurtosis 458.8198
Coefficient of Variation 15.2208
  • number_of_times_90_days_late is not normally distributed (p-value 4.282482038122857e-25)
  • number_of_times_90_days_late has 5718 outliers

number_real_estate_loans_or_lines

numerical

Approximate Distinct Count 24
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1624048 B
Mean 1.0131
Minimum 0
Maximum 37
Zeros 38066
Zeros (%) 37.5%
Negatives 0
Negatives (%) 0.0%
  • number_real_estate_loans_or_lines is skewed right (γ1 = 2.7901)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 1
Q3 2
95-th Percentile 3
Maximum 37
Range 37
IQR 2

Descriptive Statistics

Mean 1.0131
Standard Deviation 1.1103
Variance 1.2327
Sum 102830
Skewness 2.7901
Kurtosis 31.293
Coefficient of Variation 1.0959
  • number_real_estate_loans_or_lines is not normally distributed (p-value 3.3653688892530458e-16)
  • number_real_estate_loans_or_lines has 523 outliers

number_of_time_60_89_days_past_due_not_worse

numerical

Approximate Distinct Count 12
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1624048 B
Mean 0.2703
Minimum 0
Maximum 98
Zeros 96375
Zeros (%) 95.0%
Negatives 0
Negatives (%) 0.0%
  • number_of_time_60_89_days_past_due_not_worse is skewed right (γ1 = 21.5411)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 0
95-th Percentile 1
Maximum 98
Range 98
IQR 0

Descriptive Statistics

Mean 0.2703
Standard Deviation 4.5036
Variance 20.2822
Sum 27438
Skewness 21.5411
Kurtosis 464.3687
Coefficient of Variation 16.6603
  • number_of_time_60_89_days_past_due_not_worse is not normally distributed (p-value 4.248259718477243e-25)
  • number_of_time_60_89_days_past_due_not_worse has 5128 outliers

number_of_dependents

numerical

Approximate Distinct Count 13
Approximate Unique (%) 0.0%
Missing 2626
Missing (%) 2.6%
Infinite 0
Infinite (%) 0.0%
Memory Size 1582032 B
Mean 0.769
Minimum 0
Maximum 43
Zeros 58618
Zeros (%) 57.8%
Negatives 0
Negatives (%) 0.0%
  • number_of_dependents is skewed right (γ1 = 2.0759)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 1
95-th Percentile 3
Maximum 43
Range 43
IQR 1

Descriptive Statistics

Mean 0.769
Standard Deviation 1.1368
Variance 1.2923
Sum 76041
Skewness 2.0759
Kurtosis 22.2358
Coefficient of Variation 1.4782
  • number_of_dependents is not normally distributed (p-value 3.551254845332198e-22)
  • number_of_dependents has 9343 outliers

Interactions

Correlations

Missing Values